Computing the Expected Value and Variance of Geometric Measures
Authors
Abstract
Let P be a set of points in R^d, and let M be a function that maps any subset of P to a positive real number. We examine the problem of computing the exact mean and variance of M when a subset of points in P is selected according to a well-defined random distribution. We consider two distributions. In the first (which we call the Bernoulli distribution), each point p ∈ P is included in the random subset independently with probability π(p). In the second (the fixed-size distribution), a subset of exactly s points is selected uniformly at random among all subsets of s points in P. This problem is a crucial part of modern ecological analyses: each point in P represents a species in d-dimensional trait space, and the goal is to compute the statistics of a geometric measure on this trait space when subsets of species are selected under random processes. We present efficient exact algorithms for computing the mean and variance of several geometric measures when point sets are selected under one of the described random distributions. More specifically, we provide algorithms for the following measures: the bounding box volume, the convex hull volume, the mean pairwise distance (MPD), the squared Euclidean distance from the centroid, and the diameter of the minimum enclosing disk. We also describe an efficient (1 − ε)-approximation algorithm for computing the mean and variance of the mean pairwise distance. We implemented three of our algorithms: an algorithm that computes the exact mean volume of the 2D bounding box in the Bernoulli distribution, an algorithm that computes the exact mean and variance of the MPD for d-dimensional point sets in the fixed-size distribution, and a (1 − ε)-approximation algorithm for the same measure. We conducted experiments in which we compared the performance of our implementations with a standard heuristic approach used in ecological applications.
We show that our implementations can provide major speedups compared to the standard approach, and they produce results of higher precision, especially for the calculation of the variance. We also compared the implementation of our exact MPD algorithm with the corresponding (1 − ε)-approximation method; we show that the approximation method runs faster in certain cases, while also providing high-precision approximations. We thus demonstrate that, as an alternative to the exact algorithm, this method can also be used as a reliable tool for ecological analysis.
∗ Center for Massive Data Algorithmics, a Center of the Danish National Research Foundation.
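The abstract does not spell out the algorithms themselves, so the following is only a minimal sketch of the fixed-size setting for the MPD, not the authors' method. It relies on one fact that follows from linearity of expectation: under the fixed-size distribution every pair of points is equally likely to appear in the sample, so the expected MPD of a random s-subset equals the MPD of the whole set P. The function names (`mpd`, `expected_mpd_fixed_size`, `brute_force_moments`) and the sample points are hypothetical, and the brute-force enumeration is included only as a check on tiny inputs.

```python
from itertools import combinations
from math import dist, isclose


def mpd(points):
    """Mean pairwise Euclidean distance of a point set (needs >= 2 points)."""
    pairs = list(combinations(points, 2))
    return sum(dist(p, q) for p, q in pairs) / len(pairs)


def expected_mpd_fixed_size(points, s):
    """Exact E[MPD] over uniformly random s-subsets of `points`.

    Every pair is contained in the sample with the same probability, so by
    linearity of expectation the result equals the MPD of the full set for
    any 2 <= s <= n. (Illustrative helper, not taken from the paper.)
    """
    if not 2 <= s <= len(points):
        raise ValueError("s must satisfy 2 <= s <= len(points)")
    return mpd(points)


def brute_force_moments(points, s):
    """Mean and variance of the MPD by enumerating every s-subset.

    Exponential in general; intended only to validate results on tiny inputs.
    """
    values = [mpd(subset) for subset in combinations(points, s)]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


# Hypothetical 2D "trait space" with five species.
P = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0), (2.0, 2.0)]
mean_bf, var_bf = brute_force_moments(P, 3)
assert isclose(expected_mpd_fixed_size(P, 3), mean_bf)
```

The variance has no comparably simple closed form in this sketch; here it is obtained only by enumeration, which is feasible just for very small n and s.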
Similar resources
Estimation of portfolio efficient frontier by different measures of risk via DEA
In this paper, linear Data Envelopment Analysis models are used to estimate Markowitz efficient frontier. Conventional DEA models assume non-negative values for inputs and outputs. However, variance is the only variable in these models that takes non-negative values. Therefore, negative data models which the risk of the assets had been used as an input and expected return was the output are uti...
Face Recognition Based Rank Reduction SVD Approach
Standard face recognition algorithms that use standard feature extraction techniques always suffer from image performance degradation. Recently, singular value decomposition and low-rank matrix are applied in many applications, including pattern recognition and feature extraction. The main objective of this research is to design an efficient face recognition approach by combining many tech...
Evaluation of Similarity Measures for Template Matching
Image matching is a critical process in various photogrammetry, computer vision and remote sensing applications such as image registration, 3D model reconstruction, change detection, image fusion, pattern recognition, autonomous navigation, and digital elevation model (DEM) generation and orientation. The primary goal of the image matching process is to establish the correspondence between two ...
SOME RESULTS OF MOMENTS OF UNCERTAIN RANDOM VARIABLES
Chance theory is a mathematical methodology for dealing with indeterminate phenomena including uncertainty and randomness. Consequently, uncertain random variable is developed to describe the phenomena which involve uncertainty and randomness. Thus, uncertain random variable is a fundamental concept in chance theory. This paper provides some practical quantities to describe uncertain random variable...
Computing the Matrix Geometric Mean of Two HPD Matrices: A Stable Iterative Method
A new iteration scheme for computing the sign of a matrix which has no pure imaginary eigenvalues is presented. Then, by applying a well-known identity in matrix functions theory, an algorithm for computing the geometric mean of two Hermitian positive definite matrices is constructed. Moreover, another efficient algorithm for this purpose is derived free from the computation of principal matrix...
Dynamic Cross Hedging Effectiveness between Gold and Stock Market Based on Downside Risk Measures: Evidence from Iran Emerging Capital Market
This paper examines the hedging effectiveness of gold futures for the stock market in minimizing variance and downside risks, including value at risk and expected shortfall using data from the Iran emerging capital market during four different sub-periods from December 2008 to August 2018. We employ dynamic conditional correlation models including VARMA-BGARCH (DCC, ADCC, BEKK, and ABEKK) and c...